DOC: add sections about big new features (CoW, string dtype) to 3.0.0 whatsnew notes #61724
Starting with pandas 3.0, a dedicated string data type is enabled by default
(backed by PyArrow under the hood, if installed, otherwise falling back to
NumPy). This means that pandas will start inferring columns containing string
Suggested change:
- NumPy). This means that pandas will start inferring columns containing string
+ ``object``-dtype backed by NumPy). This means that pandas will start inferring columns containing string
With that edit, it seems to suggest that it falls back to ``object`` dtype (as we currently use)? But it is still a StringDtype, just using NumPy instead of PyArrow under the hood. So it is "backed by NumPy object dtype under the hood"; something like:
Suggested change:
- NumPy). This means that pandas will start inferring columns containing string
+ being backed by NumPy ``object``-dtype). This means that pandas will start inferring columns containing string
Yup, that edit is clearer. I just didn't want this to be ambiguous and suggest it could be the new NumPy string type (which it isn't).
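For anyone following the thread, here is a minimal sketch of how to check which dtype and backend a given installation actually infers. It assumes only that `pd.StringDtype` and its `storage` attribute are available; the sample Series is made up for illustration, and the result depends on the installed pandas version and on whether PyArrow is present.

```python
import pandas as pd

# Hypothetical sample data: strings plus a missing value.
ser = pd.Series(["a", "b", None])

# pandas 3.0 is expected to infer a StringDtype here; older versions
# (without opting in) infer NumPy object dtype instead.
inferred_is_string = isinstance(ser.dtype, pd.StringDtype)

# `storage` names the backend ("pyarrow" or "python");
# plain object dtype has no such attribute, so this falls back to None.
storage = getattr(ser.dtype, "storage", None)

print(pd.__version__, ser.dtype, inferred_is_string, storage)
```

A `storage` of `"python"` corresponds to the fallback discussed above: still a StringDtype, but with NumPy ``object`` dtype under the hood rather than the new NumPy string type.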
how pandas operates with respect to copies and views. A summary of the changes:

1. The result of *any* indexing operation (subsetting a DataFrame or Series in any way,
   i.e. including accessing a DataFrame column as a Series) or any method returning a
Suggested change:
- i.e. including accessing a DataFrame column as a Series) or any method returning a
+ e.g. accessing a DataFrame column as a Series) or any method returning a
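To make the column-access case concrete, here is a small sketch of the behavior the summary item describes. It is an illustration, not text proposed for the whatsnew page. The assumption is that on pandas versions before 3.0 Copy-on-Write has to be switched on through the `mode.copy_on_write` option, while on 3.0 it is always active; the sketch therefore tries to set the option and ignores any warning or error from doing so. The DataFrame contents are made up.

```python
import warnings

import pandas as pd

# Assumption: on pandas < 3.0 Copy-on-Write is opt-in via this option; on 3.0
# it is always on, so any warning or error from setting the option is ignored.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    try:
        pd.set_option("mode.copy_on_write", True)
    except Exception:
        pass

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Accessing a column is an indexing operation, so `col` behaves as a copy.
col = df["a"]
col.iloc[0] = 100

# The parent DataFrame keeps its original value.
parent_unchanged = df.loc[0, "a"] == 1
print(col.iloc[0], parent_unchanged)
```

Without Copy-on-Write, the same assignment could write through to `df`, which is the ambiguity the new rules remove.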
We don't yet list the bigger features (string dtype, CoW, no silent downcasting) in the 3.0.0 whatsnew page, so this PR starts doing that.
I have already pushed a section about the string dtype, and will add further sections about CoW and the downcasting.